G-Tric: generating three-way synthetic datasets with triclustering solutions
Similar resources
Generating Realistic Synthetic Population Datasets
Modern studies of societal phenomena rely on the availability of large datasets capturing attributes and activities of synthetic, city-level, populations. For instance, in epidemiology, synthetic population datasets are necessary to study disease propagation and intervention measures before implementation. In social science, synthetic population datasets are needed to understand how policy deci...
Generating datasets with drift
Modern challenges in machine learning include non-stationary environments. Because these environments are dynamic, learning in them is not an easy task: models have to deal both with a continuous learning process and with the acquisition of new concepts. Different types of drift can occur, as concepts can appear and disappear with different patterns, namely sudden, reoccurring, increme...
FoodBroker - Generating Synthetic Datasets for Graph-Based Business Analytics
We present FoodBroker, a new data generator for benchmarking graph-based business intelligence systems and approaches. It covers two realistic business processes and their involved master and transactional data objects. The interactions are correlated in controlled ways to enable non-uniform distributions for data and relationships. For benchmarking data integration, the generated data is store...
The Data Mining Triclustering algorithm for mining Real Valued Datasets -A Review
Cluster analysis has been widely used in several disciplines, such as statistics, software engineering, biology, psychology and other social sciences, in order to identify natural groups in large amounts of data. These data sets are constantly becoming larger, and their dimensionality prevents easy analysis and validation of the results. The subspace pattern mining has been tailored to microarr...
An empirical evaluation of easily implemented, nonparametric methods for generating synthetic datasets
When intense redaction is needed to protect data subjects’ confidentiality, statistical agencies can release synthetic data, in which identifying or sensitive values are replaced with draws from statistical models estimated from the confidential data. Specifying accurate synthesis models can be a difficult and labor-intensive task with standard parametric approaches. We describe and empirically...
Journal
Journal title: BMC Bioinformatics
Year: 2021
ISSN: 1471-2105
DOI: 10.1186/s12859-020-03925-4